Approaches to Regularized Regression – A Comparison between Gradient Boosting and the Lasso

Tobias Hepp; Matthias Schmid; Olaf Gefeller; Elisabeth Waldmann; Andreas Mayr

doi:10.3414/ME16-01-0033


Methods Inf Med 2016; 55(05): 422-430
DOI: 10.3414/ME16-01-0033

Original Articles

Georg Thieme Verlag KG Stuttgart · New York

Approaches to Regularized Regression – A Comparison between Gradient Boosting and the Lasso^[*]

Tobias Hepp¹, Matthias Schmid², Olaf Gefeller¹, Elisabeth Waldmann¹, Andreas Mayr¹ ²

¹ Institut für Medizininformatik, Biometrie und Epidemiologie, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany

² Institut für medizinische Biometrie, Informatik und Epidemiologie, Rheinische Friedrich-Wilhelms-Universität Bonn, Germany

Funding: The work on this article was supported by the German Research Foundation (DFG), grant SCHM 2966/1–2, and the Interdisciplinary Center for Clinical Research (IZKF) of the Friedrich-Alexander-University Erlangen-Nürnberg (Project J49).

Publication history

Received: 11 March 2016

Accepted in revised form: 21 June 2016

Publication date: 08 January 2018 (online)


Summary

Background: Penalization and regularization techniques for statistical modeling have attracted increasing attention in biomedical research due to their advantages in the presence of high-dimensional data. A special focus lies on algorithms that incorporate automatic variable selection, such as the least absolute shrinkage and selection operator (lasso) or statistical boosting techniques.

Objectives: Focusing on the linear regression framework, this article compares the two most common techniques for this task, the lasso and gradient boosting, from both a methodological and a practical perspective.

Methods: We describe these methods, highlighting the circumstances under which their results coincide in low-dimensional settings. In addition, we carry out extensive simulation studies comparing their performance in settings with more predictors than observations, and investigate multiple combinations of noise-to-signal ratio and number of true non-zero coefficients. Finally, we examine the impact of different tuning methods on the results.

Results: Both methods carry out penalization and variable selection for possibly high-dimensional data, often resulting in very similar models. An advantage of the lasso is its faster run-time, whereas a strength of the boosting concept is its modular nature, which makes it easy to extend to other regression settings.

Conclusions: Although they follow different strategies with respect to optimization and regularization, both methods impose similar constraints on the estimation problem, leading to comparable performance regarding prediction accuracy and variable selection in practice.
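To make the comparison in the summary concrete, the following is a minimal Python sketch of the two estimators for a linear model without intercept. It is an illustration only, not the authors' implementation: the function names (`lasso_cd`, `l2_boost`), the step length `nu = 0.1`, the penalty value and the tiny orthogonal toy data set are all assumptions introduced here. The lasso is fitted by cyclic coordinate descent on 0.5 * ||y - Xb||² + lam * ||b||₁, and boosting is the component-wise variant with squared-error loss, where the number of steps acts as the tuning parameter.

```python
def _columns(X):
    """Return the columns of a row-major list-of-lists matrix."""
    return [list(col) for col in zip(*X)]


def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=100):
    """Lasso by cyclic coordinate descent (hypothetical helper).

    Minimises 0.5 * ||y - X b||^2 + lam * ||b||_1, no intercept.
    """
    cols = _columns(X)
    col_ss = [sum(v * v for v in c) for c in cols]
    beta = [0.0] * len(cols)
    resid = list(y)
    for _ in range(n_sweeps):
        for j, (c, ss) in enumerate(zip(cols, col_ss)):
            # Inner product of x_j with the partial residual
            # (current residual with x_j's own contribution added back).
            rho = sum(a * r for a, r in zip(c, resid)) + ss * beta[j]
            new = soft_threshold(rho, lam) / ss
            delta = new - beta[j]
            resid = [r - delta * a for a, r in zip(c, resid)]
            beta[j] = new
    return beta


def l2_boost(X, y, n_steps, nu=0.1):
    """Component-wise gradient boosting with squared-error loss
    (hypothetical helper; nu is an assumed step length)."""
    cols = _columns(X)
    col_ss = [sum(v * v for v in c) for c in cols]
    beta = [0.0] * len(cols)
    resid = list(y)
    for _ in range(n_steps):
        # Least-squares fit of every single predictor to the residuals.
        fits = [sum(a * r for a, r in zip(c, resid)) / ss
                for c, ss in zip(cols, col_ss)]
        # Select the predictor that reduces the residual sum of squares most.
        best = max(range(len(cols)), key=lambda j: fits[j] ** 2 * col_ss[j])
        # Update only that coefficient, shrunken by the step length.
        beta[best] += nu * fits[best]
        resid = [r - nu * fits[best] * a for a, r in zip(cols[best], resid)]
    return beta


# Assumed toy data: three mutually orthogonal predictors,
# y = 2 * x1 + 0.5 * x2 + 0 * x3 without noise.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.5, 1.5, -1.5, -2.5]

beta_lasso = lasso_cd(X, y, lam=4.0)     # penalised fit
beta_boost = l2_boost(X, y, n_steps=10)  # early-stopped boosting fit
print(beta_lasso, beta_boost)
```

In this toy setting both fits shrink the largest coefficient towards zero and leave the two weaker coefficients at exactly zero, while an unpenalised lasso (`lam=0.0`) and a long boosting run both approach the least-squares solution. The sketch covers only this simple orthogonal case; it says nothing about the high-dimensional settings examined in the simulation studies.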

Keywords

Penalization - lasso - regularization - boosting - variable selection - high-dimensional data

^* Supplementary material published on our web-site http://dx.doi.org/10.3414/me16-01-0033


